feat: CON-1665 Include NiDKG dealings into blocks based on priority #9766
Conversation
```rust
    return false;
}
if *cap > 0 {
    *cap -= 1;
```
It could be worth documenting that this line is responsible for including at most collection_threshold dealings for a given DKG ID.
In general (though I think this could be done later), I feel it would be easier to read and more efficient to stop validating further dealings after collecting the threshold, rather than "artificially" limiting the number of included dealings here in the payload builder. Then we would need neither this if-check nor the `*cap -= 1`, while also saving work by validating fewer dealings (i.e. optimizing at the source).
I feel like it would be easier to read and be more efficient to stop validating more dealings after collecting the threshold
It's a good idea; unfortunately there isn't really a guarantee that all nodes on the subnet will validate the same `f+1` (or `2f+1`) dealings. So we would probably still have to do this accounting, otherwise one node could include one set of `f+1` dealings while another node includes a different set of `f+1` dealings.
Asynchronously validating all dealings in the DKG client is also beneficial for the finalization rate. When validating a block, we will first check if we already have the same dealing in our own validated DKG pool. If so, we will not validate it again as part of the block payload.
Ah yes I see... Thank you
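The deterministic accounting discussed above could be sketched roughly as follows. This is a simplified, hypothetical illustration (types like `DkgId` and the function name are stand-ins, not the actual payload-builder code): every node performs the same bookkeeping over the same ordered input, so all nodes agree on which dealings are included even if they each validated a different superset of dealings.

```rust
use std::collections::HashMap;

// Hypothetical simplified type; the real code uses NiDkgId and full dealing structs.
type DkgId = u32;

/// Select at most `threshold` dealings per DKG config, in iteration order.
/// Mirrors the `*cap -= 1` accounting from the diff above.
fn select_dealings(dealings: &[(DkgId, &str)], threshold: usize) -> Vec<(DkgId, String)> {
    let mut remaining: HashMap<DkgId, usize> = HashMap::new();
    let mut selected = Vec::new();
    for (id, dealer) in dealings {
        // Each config starts with `threshold` slots; decrement one per included dealing.
        let cap = remaining.entry(*id).or_insert(threshold);
        if *cap > 0 {
            *cap -= 1;
            selected.push((*id, dealer.to_string()));
        }
    }
    selected
}

fn main() {
    let dealings = [(7, "a"), (7, "b"), (7, "c"), (9, "d")];
    let selected = select_dealings(&dealings, 2);
    // Config 7 is capped at 2 dealings; config 9 keeps its single dealing.
    assert_eq!(selected.len(), 3);
    assert_eq!(selected.iter().filter(|(id, _)| *id == 7).count(), 2);
    println!("ok");
}
```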
```rust
        assert_eq!(selected.len(), 1);
        assert_eq!(selected[0].content.dkg_id, remote_target_0);
    }
}
```
Some ideas for further tests:
- 1 Local Low + 1 Remote Low + 1 Remote High, both remotes with the same target ID. Remote Low has reached capacity but not Remote High. Expected result: Remote High should be prioritized even if `MAX_REMOTE_DKGS_PER_INTERVAL = 1` and one remote instance was completed.
  Effectively, this tests that the variable `remaining_remote_target_capacity` is indexed by `NiDkgTargetId` and not by `NiDkgId`, to correctly deduce `remote_target_ids_with_no_capacity`.
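The keying distinction behind this test suggestion could be sketched like this. This is a hypothetical illustration, not the PR's actual code: the type names mimic `NiDkgTargetId`, and the rule assumed here is that a remote target only loses capacity once both its low- and high-threshold instances are finished, so keying by per-instance ID would wrongly starve the unfinished high-threshold instance.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins; the real code uses NiDkgTargetId / NiDkgId.
type TargetId = u32;

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum Threshold {
    Low,
    High,
}

/// A remote target has exhausted its capacity only once BOTH its low- and
/// high-threshold instances reached their collection threshold. Keying the
/// map by target id (not by per-instance id) keeps the high-threshold
/// instance prioritized while it is still unfinished.
fn targets_with_no_capacity(finished_instances: &[(TargetId, Threshold)]) -> HashSet<TargetId> {
    let mut finished: HashMap<TargetId, HashSet<Threshold>> = HashMap::new();
    for (target, thr) in finished_instances {
        finished.entry(*target).or_default().insert(*thr);
    }
    finished
        .into_iter()
        .filter(|(_, thrs)| thrs.len() == 2) // both Low and High finished
        .map(|(target, _)| target)
        .collect()
}

fn main() {
    // Remote Low for target 42 is done, Remote High is not:
    // the target still has capacity.
    let done = [(42, Threshold::Low)];
    assert!(targets_with_no_capacity(&done).is_empty());
    // Once both instances are done, target 42 has no capacity left.
    let done = [(42, Threshold::Low), (42, Threshold::High)];
    assert!(targets_with_no_capacity(&done).contains(&42));
    println!("ok");
}
```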
Co-authored-by: Pierugo Pace <pierugo.pace@dfinity.org>
…ichhorl/sort-dealings
Background
At the start of each interval, nodes broadcast NiDKG dealings for key generation and resharing. These dealings are included in data blocks. Once enough dealings exist on chain, they can be aggregated into a full NiDKG transcript. #8469 allows the subnet to aggregate dealings not only in the following summary block, but already as part of data blocks, as soon as enough dealings exist on the parent chain.
Problem
Due to size and validation time, the number of dealings to be included on chain is limited (currently to 1 dealing per block). Additionally, the current behavior of selecting dealings to be included is sub-optimal:

- We include up to N dealings on chain, even if only `f+1` (fresh key) or `2f+1` (high-threshold re-sharing) dealings are required to create the transcript.

Proposed Changes
This PR proposes to stop including more dealings for a config once the config's `collection_threshold` has been reached. Additionally, dealings receive priority to be included based on the following conditions (in order of precedence):

- If there are fewer than `MAX_REMOTE_DKGS_PER_INTERVAL` (currently set to 1) completed remote targets in the current interval, prioritize remote dealings. Otherwise, prioritize local dealings. This is to improve the latency of remote DKG requests such as `SetupInitialDKG` and `ReshareChainKey`. The number of prioritized remote DKGs is limited to avoid starving local DKG. Note that currently, the number of remote DKG configs per interval is also limited to 1 (by the same constant). This may however disappear in the future.

Future Work
Instead of only starting to work on remote DKG requests at the beginning of an interval, we will also do so as part of data blocks, as soon as the remote DKG request appears in the replicated state.
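The precedence rule described under Proposed Changes could be sketched as follows. This is an illustrative sketch, not the PR's implementation: the constant name matches the PR, but the enum, the rank function, and the sort-based selection are assumptions.

```rust
/// Constant name from the PR; the value is the one stated there.
const MAX_REMOTE_DKGS_PER_INTERVAL: usize = 1;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Dealing {
    Remote,
    Local,
}

/// Lower rank = included first. Remote dealings win only while fewer than
/// MAX_REMOTE_DKGS_PER_INTERVAL remote targets completed this interval;
/// afterwards local dealings take precedence, so local DKG is not starved.
fn inclusion_rank(dealing: Dealing, completed_remote_targets: usize) -> u8 {
    let prioritize_remote = completed_remote_targets < MAX_REMOTE_DKGS_PER_INTERVAL;
    match (dealing, prioritize_remote) {
        (Dealing::Remote, true) | (Dealing::Local, false) => 0,
        _ => 1,
    }
}

fn main() {
    let mut dealings = vec![Dealing::Local, Dealing::Remote];
    // No remote target completed yet: remote dealings come first.
    dealings.sort_by_key(|d| inclusion_rank(*d, 0));
    assert_eq!(dealings[0], Dealing::Remote);
    // One remote target already completed: local dealings come first.
    dealings.sort_by_key(|d| inclusion_rank(*d, 1));
    assert_eq!(dealings[0], Dealing::Local);
    println!("ok");
}
```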